A. Development of the Project.

The initial plan of the project was prompted by a series of
research publications on current examinations and methods of assessment
in high schools. These studies emphasized the inadequacies of
the evaluation methods with respect to reliability, distribution of
scores and standardization. Surveys of school failures and dropouts,
published at about the same time, showed serious bottlenecks
in the secondary school system (see list of references on page ).

There is a close relation between these issues, since success 
and failure in school, among other things, depend on the evaluation 
methods employed by the teachers, and on the availability of guid- 
ance aided by proper instruments. 

Adequate assessment methods are also essential for adaptation 
of curriculum to the level of students and their rate of progress. 
Since the structure of assessment instruments should conform to these 
purported objectives, it is important to keep one main objective in 
the foreground. There is consensus of opinion among educational 
experts emphasizing guidance as the most urgent task. 

Since standardized intelligence tests are already available in 
Hebrew, it was decided to focus on educational attainment. Owing to 
the diversity of curricula in the lower grades of the high school, 
it seemed preferable to construct a test battery to measure general 
educational development rather than strict achievement tests. Also, 
from the guidance point of view, the general level of a student is 
far more important than specific items of information. 

The well-known batteries ITED (Iowa Tests of Educational
Development) and STEP (Sequential Tests of Educational Progress)
served as a model for the present battery:

"...THE TESTS ARE DESIGNED TO MEASURE MUCH MORE THE GENERALIZED
SKILLS. THEY ARE INTENDED TO MEASURE THE PUPIL'S ABILITY TO DO
CRITICAL THINKING IN THE BROAD AREAS DESIGNATED; THEY ARE CONCERNED
NOT SO MUCH WITH WHAT THE PUPIL HAS LEARNED, IN THE SENSE OF SPECIFIC
INFORMATION, BUT RATHER WITH HOW WELL HE CAN USE WHATEVER HE HAS
LEARNED IN ACQUIRING, INTERPRETING AND EVALUATING NEW IDEAS, IN
RELATING NEW IDEAS TO OLD, AND IN APPLYING BROAD CONCEPTS AND
GENERALIZATIONS TO NEW SITUATIONS OR THE SOLUTION OF NEW PROBLEMS.

THESE ARE SOME OF THE OUTCOMES NOT ONLY OF AN EFFECTIVE COURSE OF
FORMAL SCHOOL INSTRUCTION, BUT ALSO OF ANY OTHER GENUINELY EDUCATIONAL
EXPERIENCE, WHETHER FORMAL OR INFORMAL, DIRECT OR INCIDENTAL, IN-SCHOOL
OR OUT-OF-SCHOOL..."

(From Manual for Teachers and Counsellors, ITED.)

Description of the Battery.

The battery comprises 6 subtests: 

1) Mathematics 

2) Science (Physics, Chemistry, Biology) 

3) Reading Comprehension - literature. 

4) Reading Comprehension - social studies. 

5) Social Studies - general information. 

6) English Language. 

The tests differ in their relative dependence on the school
curriculum material. In the Sciences (1 and 2 above) and English
Language (6) the connection is relatively close, while in the
humanistic studies (3, 4 and 5) the connection to the school material
is relatively loose, and general knowledge and the comprehension of
basic concepts are emphasized. Clearly, the relative weight of school
material, on the one hand, and of general knowledge, on the other, is
not equal in every item of the test.

Tests included in the battery: 

1) Mathematics - 24 questions in Geometry and Algebra. 

2) Science - 33 questions in Physics, Chemistry and Biology. 



The questions are related to the material contained within the
school curriculum; however, a pupil's score depends also on the extent
of his extracurricular reading.

3) Reading Comprehension - literature. The test examines
comprehension of extracts taken from literature. Six extracts are
given, each followed by a number of questions. The total number of
questions is 20.

4) Reading Comprehension - social studies. The test includes 5
extracts from social studies with a total of 22 questions.

5) Social Studies - general information. 22 questions on
History, Jewish History, Economics, Civics and Geography. The
questions are, in part, drawn from the school curriculum material,
while the rest depend largely on the general knowledge of the pupil.

6) English Language. 55 questions covering vocabulary, grammar
and reading comprehension.

The test items are arranged in order of difficulty. 

B. Statistical Properties.

1. The Sample.
2. Reliability.
3. Validity.
4. Intercorrelations.
5. Time-span.

1. The Sample.



The sample used as a basis for the computation of norms is a
national sample of the 9th and 10th grades of accredited academic
high schools. The statistical data reported here are based on this
sample.

Literary extracts taken from:
The Haggadah; S.H. Bergman; L. Goldberg; A. Kovner; G. Shofman;
Y. Shcaiberg.
The sample is stratified, and from each stratum a random sample is
chosen. The stratification was based on:

(i) Type of settlement in which the school is located (big cities,
various sized towns and other kinds of settlements).

(ii) The scholastic level of the school. Since there is no
accepted uniform measure of the scholastic standard of
a school, we used the percentage of successful matriculation
candidates in the preceding year, 1965.

The sample comprises 370 pupils in the 9th grade (11 classes) and
370 pupils in the 10th grade (13 classes).

The tests were given during the last trimester of the 1966 school
year.

2. Reliability.

Since no parallel test is available, we employed the split-half
method of computing reliability, dividing each test into odd and
even items. The results are given in Table I.



TABLE I.

Reliability of the Subtests and of the Total Battery*

Test                                       Grade 9    Grade 10

Mathematics                                  .75        .72
Science                                      .68        .71
Reading Comprehension - literature           .57        .65
Reading Comprehension - social studies       .53        .60
Social Studies - general information         .69        .66
English Language                             .85        .87
Total Battery                                .90        .91



The highest reliability is in the English Language test (.85 - .87).
This is, possibly, because it is the longest test and reliability increases
with length. The reliabilities of the tests in the Sciences (.68 - .75) are
higher than those in the humanistic studies (.53 - .69).

* Corrected by the Spearman-Brown Formula.
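
The odd-even split-half procedure with the Spearman-Brown correction can be sketched as follows. This is only an illustrative sketch: the function names and the five-pupil response matrix are hypothetical and are not taken from the project data.

```python
from math import sqrt

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def spearman_brown(r_half):
    """Step the correlation between two half-tests up to full test length."""
    return 2 * r_half / (1 + r_half)

def split_half_reliability(item_scores):
    """item_scores: one list of 0/1 item scores per pupil, in test order."""
    odd = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    return spearman_brown(pearson(odd, even))

# Hypothetical responses of five pupils to a six-item test (not source data).
responses = [
    [1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 0, 1, 1, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
print(round(split_half_reliability(responses), 2))
```

The correction is needed because the odd-even correlation describes two tests of half length; the stepped-up value estimates the reliability of the full-length test.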




The reliability of the subtests is not high enough for the
construction of profiles of the individual student. This is due to
their brevity and heterogeneity of content. Higher reliability could
have been obtained had the subtests been constructed from a homogeneous
pool of items; however, this would defeat the purpose of global
measurement. We considered it preferable to include a great number
of areas at the expense of reliability. Reliability could be raised
by an increase in length, but this would require a longer testing
time, which would restrict the applicability of the battery.

Use of the mean score of the subtests is therefore recommended.
It is based on 176 items. Its coverage is very broad and the
reliability (.90) is high enough for individual assessment. (The mean
score is employed instead of the total score to allow for cases of
omission of a subtest.)

Employment of raw scores weighs the tests unequally, but the
departure from equality is small, as can be seen from Table II.



TABLE II.

Correlations of the Subtest Scores with the Total Scores

Subtest                                    Grade 9    Grade 10

Mathematics                                  .68        .65
Science                                      .69        .71
Reading Comprehension - literature           .60        .63
Reading Comprehension - social studies       .65        .68
Social Studies - general information         .69        .59
English Language                             .75        .76



The range of the relative weights, i.e., the ratios of the
correlations, is between 1.0 and 1.3. The tests may be considered
as equally weighted. The largest weight is that of the English test,
which is only slightly larger than the others (1.3 in both grades).
This slight difference does not justify the additional work involved
in the application of weighted averages.
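
The relative weights can be checked from the Table II correlations. The sketch below assumes that "ratio" means each subtest-total correlation divided by the smallest such correlation in the same grade; that reading, and the function name, are assumptions rather than a procedure stated in the report.

```python
# Correlations of each subtest with the total score, from Table II
# (first value grade 9, second value grade 10).
table_ii = {
    "Mathematics": (0.68, 0.65),
    "Science": (0.69, 0.71),
    "Reading Comprehension - literature": (0.60, 0.63),
    "Reading Comprehension - social studies": (0.65, 0.68),
    "Social Studies - general information": (0.69, 0.59),
    "English Language": (0.75, 0.76),
}

def relative_weights(correlations):
    """Ratio of each correlation to the smallest one (assumed definition)."""
    smallest = min(correlations.values())
    return {name: r / smallest for name, r in correlations.items()}

grade9 = relative_weights({k: v[0] for k, v in table_ii.items()})
grade10 = relative_weights({k: v[1] for k, v in table_ii.items()})

for name in table_ii:
    print(f"{name:40s} {grade9[name]:.2f}  {grade10[name]:.2f}")
```

Under this reading the English test has the largest ratio in both grades, about 1.25 in grade 9 and about 1.29 in grade 10, which agrees with the rounded figure of 1.3 quoted above.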



3. Validity.

Methods of determining validity are quite complex since there
is rarely a single, unequivocal criterion by which to measure validity.
In fact, validity may be determined only on the basis of protracted
research and systematic exploration of relations between the tests and
practical and theoretical criteria.

The following evaluation is based on data collected within the 
scope of the present project. Further information will be collected 
in due course of time in the context of a follow-up study. 

The content validity of the battery was established by judgment
of teachers, school supervisors, and experts on the subject matter of
each subtest respectively.

Concurrent validity with respect to academic achievement is based
on correlations between the battery score and grade-point averages in
21 classes. The median correlation is .50. The interquartile range
is .32. These values underestimate the validity because of the
restriction of range within each class. This becomes evident when we note
the large difference in scores among schools. However, computation
of the correlation between the battery scores and school grades for
the whole sample would require standardization of school grades on one
consistent scale.
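
The effect of restriction of range can be illustrated with a small sketch. The two classes below are hypothetical, and their grades are assumed to be already expressed on a common scale, which is exactly what the real school grades lack; the numbers only show the direction of the effect.

```python
from math import sqrt

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical battery scores and grades for two classes at different levels.
class_a_scores, class_a_grades = [10, 11, 12, 13], [6, 8, 7, 9]
class_b_scores, class_b_grades = [20, 21, 22, 23], [12, 14, 13, 15]

within_a = pearson(class_a_scores, class_a_grades)
within_b = pearson(class_b_scores, class_b_grades)
pooled = pearson(class_a_scores + class_b_scores,
                 class_a_grades + class_b_grades)

print(round(within_a, 2), round(within_b, 2), round(pooled, 2))
```

Within each class the scores cover only a narrow band, so the correlation is lower than the one obtained when both classes are pooled and the full range of scores is represented.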

The validity with respect to prediction of academic achievement,
success and failure in school, dropping-out and effectiveness of the
tests for adaptive treatment and placement purposes, requires a
follow-up study.
4. Intercorrelations.

The intercorrelations, means and standard deviations of the
subtests are given in Tables III and IV. The entries in the diagonal
are the split-half reliabilities corrected by the Spearman-Brown
Formula. The entries above the diagonal are the raw correlations.
Below the diagonal are the correlations corrected for attenuation.



TABLE III.

Intercorrelations, means and standard deviations
of the subtests for grade 9.

(N = 370)

Subtest                                      1       2       3       4       5       6

1 Mathematics                               (.75)    .52     .27     .30     .45     .32
2 Science                                    .73    (.68)    .29     .37     .47     .29
3 Reading Comprehension - literature         .42     .47    (.57)    .47     .37     .35
4 Reading Comprehension - social studies     .48     .62     .85    (.53)    .41     .37
5 Social Studies - general information       .63     .68     .59     .68    (.69)    .28
6 English Language                           .40     .38     .50     .55     .36    (.85)

Mean                                       13.6    17.5    13.6    13.6    12.6    33.4
Standard Deviation                          4.0     4.3     2.8     3.1     ...     7.8
TABLE IV.

Intercorrelations, means and standard deviations
of the subtests for grade 10.

Subtest                                      1       2       3       4       5       6

1 Mathematics                               (.72)    .49     .31     .36     .37     .25
2 Science                                    .68    (.71)    .29     .37     .32     .38
3 Reading Comprehension - literature         .46     .43    (.65)    .51     .36     .35
4 Reading Comprehension - social studies     .55     .57     .82    (.60)    .36     .40
5 Social Studies - general information       ...     ...     .55     .57    (.66)    .21
6 English Language                           ...     ...     .47     .56     .28    (.87)

Mean                                       15.3     ...    14.0    14.3    13.4    39.9
Standard Deviation                          4.0     ...     3.2     3.0     4.0     8.0



The highest correlations (see correlations below the diagonal)
are between:

1) The Reading Comprehension tests - literature and social studies (.82-.85).
This is due to the similarity of function and content.

2) Mathematics and Science (.68-.73). These are the two science
disciplines in the battery.

Other correlations are lower, and the lowest are found for English;
however, a thorough investigation of the composition of the battery
would require a factor analytic study.
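
The correction for attenuation used for the entries below the diagonal is the standard one: the raw correlation divided by the square root of the product of the two reliabilities. A minimal sketch, with a function name chosen here for illustration, checked against the grade 9 Mathematics and Science entries:

```python
from math import sqrt

def correct_for_attenuation(r_xy, r_xx, r_yy):
    """Raw correlation divided by the geometric mean of the reliabilities."""
    return r_xy / sqrt(r_xx * r_yy)

# Grade 9, Mathematics and Science: raw correlation .52,
# reliabilities .75 and .68 (diagonal entries).
print(round(correct_for_attenuation(0.52, 0.75, 0.68), 2))  # 0.73
```

Because the published entries are rounded to two decimals, recomputing other cells this way can differ from the tabled value by .01.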



5. Time-span.

Part E gives details of the times allowed for each test. These
time-spans were determined in the pre-test. It is customary to allow
85% of the examinees to finish the test. This, too, was our basis
for deciding on the time-span for each subtest.

The tests, therefore, depend little on speed of performance. It
has been found that additional time does not affect the score appreciably.
However, the tester should be precise about the time-span since
the statistical data and the norms are computed according to the time
limits given in the instructions.
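
One way such a time limit could be read off pre-test data is sketched below. The completion times are hypothetical, the function name is illustrative, and the share of examinees who should finish is passed as a parameter (85 per cent in the example); the report does not describe the actual computation.

```python
def time_limit(completion_minutes, percent=85):
    """Smallest observed time by which at least `percent` per cent of the
    examinees had finished (integer arithmetic avoids rounding surprises)."""
    ordered = sorted(completion_minutes)
    needed = (percent * len(ordered) + 99) // 100  # ceiling of percent * n / 100
    return ordered[needed - 1]

# Hypothetical pre-test completion times in minutes (not source data).
pretest = [18, 20, 21, 22, 23, 24, 25, 26, 27, 28,
           29, 30, 31, 33, 35, 36, 38, 40, 44, 50]
print(time_limit(pretest))  # 38
```

With 20 examinees, 17 must finish, so the limit is the 17th shortest completion time.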

C. Application of the Battery.

This battery is designed to be used as an aid in counseling and
guidance for pupils in the 9th and 10th grades of academic high schools.
The norms given in Table VII allow the level of the student to be
compared with the national standard.

Knowing the levels of students is particularly important when
dealing with youngsters in need of counseling and guidance when,
for example, they are on the threshold of dropout or when there is
doubt about promoting a certain pupil to a higher
class. Class grades and the evaluation of teachers who have known a
pupil over a year or two are, of course, consequential factors in such
situations; however, an accurate measuring instrument standardized on
national norms would serve to clarify the picture of the problematic
student.

It is often possible to prevent dropout from the educational
framework by channelling the pupil to an educational environment more
suited to his level. We emphasize here that this battery is not yet
complete and is inadequate in its present form to function as a basis
for the removal of a pupil from an academic high school to a
non-academic one. Such a step involves the investigation of specific
abilities and an analysis of personality traits. The results of the
battery provide only one factor of the many which have to be considered.
Problematic cases should be dealt with by a counseling psychologist or
by a vocational training centre.

From a psychological point of view, it is particularly interesting 
to note cases of significant discrepancies between test scores and 
school grades. Some of these discrepancies may be put down to random 
deviation stemming from inaccuracies in measurement. The results of 
any specific test might have been influenced by disturbed concentration 
or overtenseness. On the other hand, they might also have been 
influenced by a lucky guess or the appearance of well-known items in 
the test. The same applies to teacher evaluations, whose accuracy 
leaves much to be desired. 

Significant discrepancies between the battery score and school
grades usually indicate adjustment difficulties within the school
framework, whether related to motivation or personality conflicts.
These youngsters are in need of special attention from the school's
counselling psychologist.

In addition to its use with problematic children, the battery
may also be applied as a class scale.

Pupils may be graded according to the raw scores they achieved
in the battery, but it is easier to comprehend the significance of
the scores when they are translated into norms (see Table VII).
The norms bring the scores onto a scale of national standards.

The most accurate evaluation of the pupil's place in his class 
is based on the norm of his mean score in the battery and on the 
grade given by the teacher. The method for computing norms is 
detailed in part D. 

The class as a whole can be graded on the basis of a national
sample in the following way:

The mean score of the class is determined from the mean score
of the pupils in each subtest. The data given in Tables V and VI
allow the teacher to locate the position of his class relative to the
distribution of a national sample of class means. For example, if the
mean score of a certain 9th grade is 13.3 in the mathematics subtest,
this signifies that the class belongs to the 3rd quarter of the
national sample in mathematics. A 10th grade with a mean score of
13.1 in literature would belong to the 2nd quarter of the national
sample, and so on.

We have already noted that the score achieved in one test is not
a reliable enough basis for the construction of individual profiles.
The mean score of a class, however, is more stable, since random
measurement errors tend to cancel out, and it is therefore
possible to compute the class level in each subtest separately.
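
Locating a class mean in its quarter amounts to a lookup against the three boundaries of Tables V and VI. The sketch below covers only the two subtests used in the examples above; the dictionary layout and function name are illustrative assumptions.

```python
from bisect import bisect_left

# Upper limits of the first three quarters of class means,
# keyed by (grade, subtest), taken from Tables V and VI.
BOUNDS = {
    (9, "Mathematics"): (12.3, 13.1, 14.9),
    (10, "Reading Comprehension - literature"): (13.0, 14.1, 14.3),
}

def class_quarter(grade, subtest, class_mean):
    """Return 1 to 4: the quarter of the national sample of class means."""
    return bisect_left(BOUNDS[(grade, subtest)], class_mean) + 1

print(class_quarter(9, "Mathematics", 13.3))                          # 3
print(class_quarter(10, "Reading Comprehension - literature", 13.1))  # 2
```

A mean equal to an upper limit (for example 12.3 in grade 9 mathematics) is counted in the lower quarter, matching the "to 12.3" and "12.4-13.1" ranges of the tables.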



TABLE V.

Interquartile ranges of the distribution of class means
of each subtest for grade 9.

Class group               Mathe-     Science    Reading    Reading    Social     English    Mean
                          matics                Comp. -    Comp. -    Studies -  Language   score in
                                                lit.       social     general               battery
                                                           studies    info.

Up to first quartile      to 12.3    to 15.7    to 12.7    to 12.7    to 10.3    to 29.8    to 16.4
First quartile to median  12.4-13.1  15.8-17.0  12.8-13.7  12.8-13.9  10.4-12.2  29.9-33.6  16.5-17.1
Median to third quartile  13.2-14.9  17.1-19.6  13.8-14.3  14.0-14.2  12.3-14.0  33.7-35.6  17.2-18.3
Fourth quartile onwards   15.0-      19.7-      14.4-      14.3-      14.1-      35.7-      18.4-
TABLE VI.

Interquartile ranges of the distribution of class means
of each subtest for grade 10.

Class group               Mathe-     Science    Reading    Reading    Social     English    Mean
                          matics                Comp. -    Comp. -    Studies -  Language   score in
                                                lit.       social     general               battery
                                                           studies    info.

Up to first quartile      to 13.2    to 16.6    to 13.0    to 13.0    to 12.7    to 35.5    to 18.1
First quartile to median  13.3-16.0  16.7-17.8  13.1-14.1  13.1-13.8  12.8-14.2  35.6-38.8  18.2-19.2
Median to third quartile  16.1-16.3  17.9-21.0  14.2-14.3  13.9-14.9  14.3-14.4  38.9-44.3  19.3-21.2
Fourth quartile onwards   16.4-      21.1-      14.4-      15.0-      14.5-      44.4-      21.3-

D. Distribution and Norms.

The distributions of scores of the battery in grades nine and ten are
given in Figure A.

The norms are based on a sample of classes, grade nine and grade ten,
in accredited academic high schools. The sampling procedure was based on
a stratification of schools by type of settlement and level of school
(based on the results of matriculation examinations). The total number
of cases was 370 from each grade. The sample covered 21 classes. The
data were collected during the final trimester of 1966.

The means of the distributions are 16.9 and 18.8 in grades nine and
ten respectively. The standard deviations are 3.3 and 3.9 in grades
nine and ten respectively.

The large overlap of the distributions shows that individual
differences far exceed the differences between grade levels. Similar
results can be seen by comparison of class means. For example, on the
Social Studies - general information subtest the means of the median
class of grade nine and grade ten are 12.3 and 14.3 respectively, the
difference being 2.0, while the ranges of the class means of grade nine
and of grade ten are 7.5 and 5.3 respectively. Similar results occur
in the other subtests.
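
The size of the overlap can be put in one number by expressing the difference between the grade means in standard deviation units. This standardized difference is not computed in the report; it is added here only as an illustration, using the reported means and standard deviations and assuming equal group sizes (370 pupils per grade).

```python
from math import sqrt

# Reported means and standard deviations of the battery mean score.
mean_9, sd_9 = 16.9, 3.3
mean_10, sd_10 = 18.8, 3.9

# Pooled standard deviation for two groups of equal size.
pooled_sd = sqrt((sd_9 ** 2 + sd_10 ** 2) / 2)

# Difference between grade means in pooled standard deviation units.
d = (mean_10 - mean_9) / pooled_sd
print(round(d, 2))  # 0.53
```

A difference of about half a standard deviation means that a large share of ninth graders score above the tenth grade mean, which is the overlap described above.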

This is an expected outcome, especially for a battery measuring
general educational development, for attainment here depends on the
background of the pupils, methods of instruction, extent of individual
treatment, encouragement of extracurricular activities, and the selective
policies of various schools rather than on progress during one year
of high school. However, the available data are not sufficient for a
thorough analysis of the factors involved.

For guidance purposes it is necessary to extend the norms in order
to cover vocational schools and agricultural schools. This will be
done in conjunction with the forthcoming validation and follow-up study.



Figure A

Distribution of scores (average of six subtests) in grade nine
and in grade ten, in percentages.

[Figure: percentage of pupils plotted against average score, with
separate curves for grade nine and grade ten.]

Mean of average score: grade nine - 16.9; grade ten - 18.8.
Standard deviation of scores: grade nine - 3.3; grade ten - 3.9.
In the case of a pupil missing a test, or of the test being disqualified
for technical reasons or by chance, the mean score of the 5
tests will be computed and the following points (Table VIII) will be
added to or subtracted from the resultant score (rounding to whole
numbers will be done after the correction).



TABLE VIII.

Number of points to add to or subtract from the
scores of pupils participating in 5 subtests

Omitted subtest                            Procedure    9th grade    10th grade

Mathematics                                subtract        .8           .8
Science                                    -               -            -
Reading Comprehension - literature         subtract        .8          1.1
Reading Comprehension - social studies     subtract        .8          1.0
Social Studies - general information       subtract       1.0          1.2
English Language                           add            3.2          4.1



The purpose of these corrections is to compensate for the differences
in the test means. For example, a pupil who missed the English
test will lose more points than a pupil who missed the literature test.
The corrections are the differences between the mean (of the total
sample) on 6 tests and the means (of the total sample) on 5 tests.
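
The corrections can be recomputed from the grade 9 subtest means of Table III. The sketch below is an illustrative check, not the authors' own computation; the function name and the sign convention (negative means subtract, positive means add) are assumptions.

```python
# Grade 9 subtest means, from Table III.
means_9 = {
    "Mathematics": 13.6,
    "Science": 17.5,
    "Reading Comprehension - literature": 13.6,
    "Reading Comprehension - social studies": 13.6,
    "Social Studies - general information": 12.6,
    "English Language": 33.4,
}

def omission_corrections(subtest_means):
    """Mean of all subtests minus the mean of those left when one is omitted.
    Negative values are to be subtracted, positive values added."""
    total = sum(subtest_means.values())
    n = len(subtest_means)
    full_mean = total / n
    return {name: full_mean - (total - m) / (n - 1)
            for name, m in subtest_means.items()}

for name, c in omission_corrections(means_9).items():
    print(f"{name:40s} {c:+.1f}")
```

The results agree with the 9th grade column of Table VIII to one decimal (for example about -0.8 for Mathematics and +3.2 for English), and the value for Science is close to zero because its mean lies near the battery mean.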

The table of norms is based on the means of 6 tests, and if it is
used with the mean of 5 tests a certain inaccuracy will result.
The reliability of the mean of 5 tests is also affected to a certain
extent.

Where 2 or more tests are omitted the battery cannot provide a
basis for the construction of individual profiles.



E. Instructions for Administering the Battery.

1) The battery may be administered to the class as a whole or to
individuals.

2) The test will be administered by the class teacher. The presence
of one other person is desirable. The pupils should be spaced out
sufficiently.

The administrator of the battery should arrange the technicalities
in advance (proper and convenient seating order and space, sufficient
supply of writing materials, etc.) since the test time is limited and
technical hindrances during the test could disturb the timing.

3) The tests will be given in two sittings with a few days' interval
between them.



4) The administrator will address the following words of explanation 
to the pupils before they begin. The purpose is to reduce the tension 
of an examination and, at the same time, to arouse a serious attitude 
towards the battery. The explanation: 



We are going to give 6 tests to the class, 3 today and 3 on [...].
The tests cover specific knowledge in different subjects and general
knowledge. You will come across questions on subjects which you have
not studied or read about. This is because the tests have also been
designed for classes with a different school curriculum from yours.
However, you will find enough questions which you will have no difficulty
in answering.

Not everybody will be able to finish the tests within the time
allotted. If you finish before time is up you should check your answers.

After the tests are distributed we shall read the instructions
together. Any questions you may have connected with [...]
will be answered. I will tell you when to open [...] and
you will all start together. If anyone has a question during the test
he will raise his hand and I'll come to him. I'll [...]
questions and will not answer questions connected with the material in
the test. When I tell you to stop you will all put down your pens.

The first test is Science and after it has been distributed
we'll read the instructions together.
5) The test administrator will distribute the questionnaires with
reply sheets. The pupils will fill in their names on the reply sheets. 

The administrator will read the instructions for the first test aloud. 

He will be strict about seeing that the whole class starts and stops together. 

6) Before starting the next tests the main points of the instructions 
will be repeated briefly as a reminder. It is also desirable to repeat 
the instructions before the second examination period. 



7) The order of the tests and the time allowed for each one.

First examination period.

1) Social Studies - general information      15 minutes
2) Mathematics                               50 minutes
3) Reading Comprehension - literature        30 minutes

A total of 1 hour 35 minutes. With the explanation and distribution,
less than 2 hours.

The mathematics test will be given straight after the social studies
paper. A 10 minute interval is then advisable before the last
test. Rough paper will be distributed for the mathematics test.

Second examination period.

4) Reading Comprehension - social studies    30 minutes
5) Science                                   30 minutes
6) English Language                          30 minutes

A total of 1½ hours. With the explanation and interval, less than
2 hours.

Science will come immediately after the reading comprehension test,
followed by a 10 minute interval before the English paper. There is no
reply sheet for the English test; answers are written on the questionnaire.

8) The tests will be marked according to a correct answer sheet which 
will be supplied to the teachers. 

9) A pupil's score in each test is the number of correct answers. 

The mean score will be calculated on the basis of the total number of 
points scored in all 6 tests, divided by 6. Norms are given in Table VII. 
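
The scoring rule, including the Table VIII correction for one omitted subtest, can be sketched as follows. Several points are assumptions made for illustration only: the names are hypothetical, Science is given a correction of 0.0 because Table VIII shows no entry for it, rounding is done half-up, and an error is raised when two or more subtests are missing.

```python
from math import floor

SUBTESTS = (
    "Mathematics",
    "Science",
    "Reading Comprehension - literature",
    "Reading Comprehension - social studies",
    "Social Studies - general information",
    "English Language",
)

# 9th grade corrections from Table VIII (negative = subtract, positive = add).
# Assumption: no entry is listed for Science, so 0.0 is used here.
CORRECTIONS_9 = {
    "Mathematics": -0.8,
    "Science": 0.0,
    "Reading Comprehension - literature": -0.8,
    "Reading Comprehension - social studies": -0.8,
    "Social Studies - general information": -1.0,
    "English Language": 3.2,
}

def battery_mean(scores):
    """scores: dict mapping subtest name to number of correct answers."""
    missing = [s for s in SUBTESTS if s not in scores]
    if len(missing) > 1:
        raise ValueError("two or more subtests omitted")
    if missing:
        value = sum(scores.values()) / 5 + CORRECTIONS_9[missing[0]]
    else:
        value = sum(scores.values()) / 6
    return floor(value + 0.5)  # round to a whole number after the correction

# Hypothetical 9th grade pupil (not source data).
pupil = dict(zip(SUBTESTS, (14, 18, 13, 14, 12, 33)))
print(battery_mean(pupil))
```

For this hypothetical pupil the score is the same whether all six subtests are used or the English test is dropped and the correction applied, which is what the correction is meant to achieve on average.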


